Merge dev update to main: RAG agent/graph for XANES, memory/database, evaluation by tdpham2 · Pull Request #112 · argonne-lcf/ChemGraph

tdpham2 · 2026-04-17T15:49:07Z

Summary

Major improvements for ChemGraph:

Add memory storage via SQLite, with the ability to view and resume previous sessions: 6506af2
Add a RAG agent: cd3d63f
Add single agent for XANES spectra simulation: 48e78ca
Add new evaluation pipeline, dataset and CLI: 5a4076f
Fix MACE/PyTorch issue: 6371f7b
Fix CLI bugs: Refactor CLI, unify model routing, and fix logging/config bugs (#113); Fix UI/CLI (#114)
Fix multi-agent: Refactor multi-agent to Send()-based architecture, remove multi_agent_mcp, and clean up MCP examples (#115)
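
For readers unfamiliar with the session-memory item above, here is a minimal sketch of what SQLite-backed session persistence could look like. The table layout and the function names (`open_store`, `create_session`, `log_message`, `load_history`) are hypothetical and chosen for illustration only; they are not taken from ChemGraph's code.

```python
import sqlite3
import uuid
from datetime import datetime, timezone

# Hypothetical schema: one row per session, one row per logged message.
SCHEMA = """
CREATE TABLE IF NOT EXISTS sessions (
    session_id TEXT PRIMARY KEY,
    title      TEXT NOT NULL,
    created_at TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS messages (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    session_id TEXT NOT NULL REFERENCES sessions(session_id),
    role       TEXT NOT NULL,
    content    TEXT NOT NULL
);
"""


def open_store(path=":memory:"):
    """Open (or create) the database and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def create_session(conn, title):
    """Insert a new session row and return its generated id."""
    session_id = uuid.uuid4().hex
    created_at = datetime.now(timezone.utc).isoformat()
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO sessions VALUES (?, ?, ?)",
            (session_id, title, created_at),
        )
    return session_id


def log_message(conn, session_id, role, content):
    """Append one message to a session."""
    with conn:
        conn.execute(
            "INSERT INTO messages (session_id, role, content) VALUES (?, ?, ?)",
            (session_id, role, content),
        )


def load_history(conn, session_id):
    """Return the messages of a session in insertion order, e.g. to resume it."""
    rows = conn.execute(
        "SELECT role, content FROM messages WHERE session_id = ? ORDER BY id",
        (session_id,),
    ).fetchall()
    return [{"role": role, "content": content} for role, content in rows]
```

Resuming a session then amounts to calling `load_history` with a stored id and feeding the messages back to the agent as prior context.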

Update dev from main

Add RAG Agent to ChemGraph

Add session memory persistence with CLI session management.

Add evaluation & benchmarking module with LLM-as-judge and documentation

Add OpenCode MCP configuration for using ChemGraph tools directly

Add missing evaluation documentations and update config.toml

…_truth.json

Add structured output evaluation with checkpointing and FormatterAgent retry
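
The checkpointing and resume behaviour named in the last item can be illustrated with a small sketch. It assumes results are keyed by a string query id and stored as one JSON file; `run_with_checkpoint` and its parameters are hypothetical names, not the chemgraph.eval API.

```python
import json
from pathlib import Path


def run_with_checkpoint(queries, evaluate, checkpoint_path, resume=False):
    """Evaluate each query once, saving progress after every item.

    queries: dict mapping a string id to a query (ids become JSON keys).
    evaluate: callable returning a JSON-serializable result.
    """
    path = Path(checkpoint_path)
    results = {}
    if resume and path.exists():
        # Pick up where a previous (possibly interrupted) run stopped.
        results = json.loads(path.read_text())

    for query_id, query in queries.items():
        if query_id in results:
            continue  # already evaluated in an earlier run
        results[query_id] = evaluate(query)
        # Write to a temp file first, then rename, so an interruption
        # during the write does not leave a truncated checkpoint behind.
        tmp = path.with_name(path.name + ".tmp")
        tmp.write_text(json.dumps(results))
        tmp.replace(path)
    return results
```

Saving after every item costs extra writes but means an interrupted run loses at most the query that was in flight.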

tdpham2 · 2026-04-17T15:50:55Z

@keceli I will keep this PR open for now so I can track the major changes before asking you to review it.

… logging/config bugs

- Move CLI from src/ui/cli.py to src/chemgraph/cli/ (main.py, commands.py, formatting.py)
- Extract shared async helper into chemgraph.utils.async_utils
- Update pyproject.toml entry point: ui.cli:main -> chemgraph.cli:main
- Merge supported_argoproxy_models into supported_argo_models with argo: prefix
- Switch Groq to prefix-based routing (groq:<model>) instead of curated list
- Add ALCF model support with Globus OAuth auth flow and default base URL
- Fix logging to use stderr instead of stdout (critical for MCP stdio transport)
- Add configure_logging() for package-wide verbosity control (-v/-vv flags)
- Replace bare print() debug statements with proper logger calls
- Add --base-url CLI flag and workflow aliases (python_repl -> python_relp)
- Use ThreadPoolExecutor for cross-platform timeout instead of Unix signals
- Add default section merging to config loader so partial configs don't crash
- Add [api.alcf] config section and DEPLOYMENT.md
- Add dependency_tests.yml CI workflow
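
The stderr fix matters because an MCP server on stdio transport uses stdout for protocol messages, so any log line written there corrupts the stream. A minimal sketch of stderr-only logging with -v/-vv style verbosity follows. The function name `configure_logging_sketch`, its signature, and the level mapping are assumptions for illustration; the actual `configure_logging()` in this PR may differ.

```python
import logging
import sys


def configure_logging_sketch(verbosity=0, package="chemgraph"):
    """Route a package's logs to stderr; verbosity 0/1/2+ -> WARNING/INFO/DEBUG."""
    if verbosity <= 0:
        level = logging.WARNING
    elif verbosity == 1:
        level = logging.INFO
    else:
        level = logging.DEBUG

    logger = logging.getLogger(package)
    logger.setLevel(level)

    # Drop handlers from earlier calls so repeated configuration
    # does not produce duplicate log lines.
    for handler in list(logger.handlers):
        logger.removeHandler(handler)

    handler = logging.StreamHandler(sys.stderr)  # never stdout
    handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False  # keep root-logger handlers from re-emitting
    return logger
```

With argparse, a `-v` flag declared with `action="count"` would yield the `verbosity` integer passed in here.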

- anthropic.py: Replace all OpenAI references with Anthropic in logs, prompts, docstring, and critically fix env var from OPENAI_API_KEY to ANTHROPIC_API_KEY so auth retry actually works
- openai.py: Fix argo:gpt-5.4 wire name from 'gpt52' to 'gpt54'

Refactor CLI, unify model routing, and fix logging/config bugs
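
One item in the refactor notes above replaces Unix signals with ThreadPoolExecutor for timeouts. Signal-based timeouts (SIGALRM) are unavailable on Windows and only work in the main thread, which is presumably the motivation. A minimal sketch of the thread-based approach, with `run_with_timeout` as a hypothetical helper name:

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def run_with_timeout(func, timeout_s, *args, **kwargs):
    """Run func in a worker thread and stop waiting after timeout_s seconds.

    Caveat: Python threads cannot be killed, so on timeout the worker keeps
    running in the background; the caller simply stops waiting for it.
    """
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(func, *args, **kwargs)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        raise TimeoutError(f"call exceeded {timeout_s} s") from None
    finally:
        # Do not block here waiting for a worker that may still be running.
        executor.shutdown(wait=False)
```

The trade-off against signals is that the timed-out work is abandoned rather than interrupted, so this suits calls that eventually return on their own.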

… logic

- Rewrite multi_agent.py with LangGraph Send() pattern for parallel executor fan-out (Planner -> Send(executor_subgraph) -> Planner)
- Replace sequential worker loop with independent executor subgraphs that each run a ReAct loop (executor_agent -> ToolNode -> executor_agent)
- Add retry logic with error feedback for planner and response agent JSON parsing failures
- Reduce multi_agent_mcp.py to a thin wrapper delegating to construct_multi_agent_graph (ToolNode handles MCP tools natively)
- Rename formatter_max_retries to max_retries for consistency
- Update PlannerResponse schema, state definitions (ExecutorState, PlannerState), and multi-agent prompts for new architecture
- Update single_agent.py and ase_tools.py
- Update tests for new schemas and add planner fallback retry tests
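
The "retry logic with error feedback" item can be sketched as follows: when the model's reply fails to parse as JSON, the parse error is appended to the conversation and the model is asked again. `call_with_json_retry` and the `llm_call` callable are hypothetical stand-ins, and the meaning given to `max_retries` here (number of extra attempts after the first) is an assumption, not necessarily the semantics used in ChemGraph.

```python
import json


def call_with_json_retry(llm_call, prompt, max_retries=3):
    """Call llm_call(messages) until its reply parses as JSON.

    llm_call: callable taking a list of strings and returning a string.
    On a parse failure the bad reply and the error text are appended to
    the messages, so the next attempt can correct itself.
    """
    messages = [prompt]
    last_error = None
    for _ in range(max_retries + 1):
        raw = llm_call(messages)
        try:
            return json.loads(raw)
        except json.JSONDecodeError as exc:
            last_error = exc
            feedback = (
                f"Your previous reply was not valid JSON ({exc}). "
                "Reply with valid JSON only."
            )
            messages = messages + [raw, feedback]
    raise ValueError(
        f"no valid JSON after {max_retries + 1} attempts"
    ) from last_error
```

Feeding the error back, rather than repeating the same prompt, gives the model a concrete reason to change its output on the next attempt.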

…gent

Since LangGraph's ToolNode handles both sync LangChain tools and async MCP tools natively, the separate multi_agent_mcp workflow is redundant. The multi_agent workflow now works identically with MCP tools passed via the tools parameter.

- Delete src/chemgraph/graphs/multi_agent_mcp.py
- Remove multi_agent_mcp import, workflow_map entry, and dispatch branch from llm_agent.py
- Remove from CLI workflow choices and eval config valid types
- Remove from test_graphs.py and test_graph_constructors.py
- Update docs/evaluation.md and SKILL.md workflow tables

Replace duplicated MCP server code in example scripts with references to the canonical chemgraph.mcp.mcp_tools module. This eliminates ~960 lines of duplicated tool definitions that were drifting out of sync.

- Delete scripts/mcp_example/mcp_stdio/mcp_tools_stdio.py (484 lines)
- Replace scripts/mcp_example/mcp_http/start_mcp_server.py (486 lines) with a 24-line thin wrapper importing chemgraph.mcp.mcp_tools
- Update stdio run scripts to spawn 'python -m chemgraph.mcp.mcp_tools' instead of referencing local server copies
- Add scripts/mcp_example/mcp_stdio/run_chemgraph_multi_agent.py for local multi-agent MCP testing using the multi_agent workflow
- Update start_mcp_server.sub to use the module directly
- Update both README files with corrected instructions and port numbers

Refactor multi-agent to Send()-based architecture, remove multi_agent_mcp, and clean up MCP examples
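
To make the Planner -> executors -> Planner shape concrete, here is a plain-Python analogue of the fan-out and fan-in, using only the standard library. It is not LangGraph's Send API and not ChemGraph's implementation; `plan`, `execute`, and `run_round` are hypothetical placeholders for the planner, an executor subgraph, and one planning round.

```python
from concurrent.futures import ThreadPoolExecutor


def plan(query):
    """Hypothetical planner: split a query into independent subtasks."""
    return [task.strip() for task in query.split(";") if task.strip()]


def execute(task):
    """Hypothetical executor: stands in for one executor subgraph run."""
    return f"done: {task}"


def run_round(query, executor_fn=execute, max_workers=4):
    """Fan subtasks out to parallel workers, then gather results for the planner."""
    tasks = plan(query)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input tasks.
        results = list(pool.map(executor_fn, tasks))
    return dict(zip(tasks, results))
```

The point of the shape is that each subtask runs independently of the others, instead of passing through a single sequential worker loop.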

tdpham2 and others added 30 commits March 10, 2026 12:32

Add MCP instructions for using ChemGraph MCP with OpenCode

b67bff0

Add status to current OpenCode/MCP integration

303ca53

Initial push for ChemGraph-RAG agent

0dc658e

Add a demo for Rag-agent using Argo

d690427

Merge pull request #96 from argonne-lcf/main

93b6282

Update dev from main

Fix lintting

038f9da

Merge pull request #98 from argonne-lcf/dev-rag

cd3d63f

Add RAG Agent to ChemGraph

Increase max character limit for title

84f9012

Update llm_agent to write to database

cde7e2d

Add CLI for memory-related functions

a29f43c

Add an example for memory-related operations

3b7447b

Update documentations

c5fca55

Add tests for creating and logging session

c5fbf1d

Add test for memory

0761c2e

Merge pull request #101 from argonne-lcf/dev-memory

6506af2

Add session memory persistence with CLI session management.

Update how log_dir is initialized

a2ae6a4

remove deepdiff from pyproject.toml

2b78a90

Update vibrational analysis output to avoid overwritten files

40924f3

Update CLI for evaluation

caf7ae0

Update default tools for single_agent

e2119e5

Update chemgraph.eval module

22e9b00

Merge pull request #105 from argonne-lcf/dev-eval

3b045ac

Add evaluation & benchmarking module with LLM-as-judge and documentation

Add instructions for using OpenCode + ChemGraph MCP tools on Aurora

a533a0f

Rename examples to chemgraph_opencode

da9379e

Merge pull request #94 from argonne-lcf/dev-opencode

d1694dd

Add OpenCode MCP configuration for using ChemGraph tools directly

Add evaluation documentations

aa0760f

Update config.toml for evaluation

5593209

Merge pull request #106 from argonne-lcf/dev-eval

10a52e2

Add missing evaluation documentations and update config.toml

Add XANES tools and update single-agent graph logic

3b03cd6

fix dup bug

d32369d

tdpham2 and others added 12 commits April 13, 2026 09:20

Update ground_truth data for evaluation and script to generate ground…

a42dd58

…_truth.json

Update ase input schema

1b02004

Add structured_output option to CLI

fcc5121

Update structured_output to default evaluation mode

561752d

Add structured_output_judrge

0b1fc19

Add retry to FormatterAgent

a0f8df6

Add formatter_max_retries to agent initialization

54c4a34

Update evaluation logic when FormatterAgent fails

0a8a5b1

Add checkpointing to evaluation

b7e2d3c

Add --resume option to config and cli to restart evaluation

6bef089

Merge pull request #111 from argonne-lcf/dev-eval

5a4076f

Add structured output evaluation with checkpointing and FormatterAgent retry

Add thread lock to MACE

6371f7b

tdpham2 changed the title ~~Merge recent update to main: RAG agent/graph for XANES, memory/database, evaluation~~ Merge dev update to main: RAG agent/graph for XANES, memory/database, evaluation Apr 17, 2026

tdpham2 self-assigned this Apr 17, 2026

tdpham2 and others added 15 commits April 22, 2026 10:48

Merge pull request #113 from argonne-lcf/dev-bugfix

5040c10

Refactor CLI, unify model routing, and fix logging/config bugs

Fix linting

7822dae

Fix linting for examples/ and scripts/

ce66256

Fix linting for tests/

4277336

Fix test_memory.py

b78cc98

Fix test memory for windows

403a6c9

Move test-pypi to workflow dispatch only

338e72f

Merge pull request #115 from argonne-lcf/dev-multiagent

7a2ab89

Refactor multi-agent to Send()-based architecture, remove multi_agent_mcp, and clean up MCP examples

Add missing parsing functionality

f3e95aa

Remove debug log from ase_tools.py

a5ccb97

tdpham2 wants to merge 86 commits into main from dev

2 participants
